Dynamic lora load/unload sidecar #31

Merged by k8s-ci-robot into kubernetes-sigs:main on Nov 18, 2024 (34 commits)

coolkp (Contributor) commented Oct 23, 2024

Adds a sidecar example for dynamically managing LoRA adapters on a vLLM server.
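
To make "dynamically managing" concrete, here is a minimal sketch of the reconcile step such a sidecar might perform: compare the adapters listed in a config (the commit log mentions a ConfigMap) with the adapters the server currently has loaded, and work out what to load and what to unload. The `Adapter` record, its field names, and `plan_reconcile` are hypothetical names chosen for illustration. They are not taken from the PR's code, and the sketch deliberately leaves out the HTTP calls to the model server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    """Hypothetical adapter record; field names are illustrative only."""
    id: str
    source: str


def plan_reconcile(desired, loaded_ids):
    """Return (to_load, to_unload) so the server ends up with exactly `desired`.

    desired:    iterable of Adapter objects read from the config
    loaded_ids: iterable of adapter ids the server reports as loaded
    """
    desired = list(desired)
    desired_ids = {a.id for a in desired}
    loaded = set(loaded_ids)
    # Adapters in the config that the server does not have yet.
    to_load = [a for a in desired if a.id not in loaded]
    # Adapters the server has that are no longer in the config.
    to_unload = sorted(loaded - desired_ids)
    return to_load, to_unload


desired = [Adapter("sql-lora", "/adapters/sql"), Adapter("chat-lora", "/adapters/chat")]
to_load, to_unload = plan_reconcile(desired, ["chat-lora", "old-lora"])
```

Keeping the diff as a pure function separates the decision from the side effects, so it can be unit-tested without a running server; the caller would then issue one load or unload request per entry.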

linux-foundation-easycla bot commented Oct 23, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 23, 2024
liu-cong (Contributor) commented

/assign

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 30, 2024
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 30, 2024
coolkp (Contributor Author) commented Oct 30, 2024

Thanks @liu-cong @guydc for reviewing, and deep apologies for the delay in responding. I had my notifications misconfigured, so emails went to the wrong place :(

Signed-off-by: Kunjan Patel <[email protected]>
…et changes, pull dynamically from configmap

Signed-off-by: Kunjan Patel <[email protected]>
@coolkp coolkp requested a review from ahg-g November 9, 2024 02:00
ahg-g (Contributor) left a comment

why are we placing this under examples?

Signed-off-by: Kunjan Patel <[email protected]>
coolkp (Contributor Author) commented Nov 11, 2024

> why are we placing this under examples?

Where do you suggest we place it? @ahg-g

ahg-g (Contributor) commented Nov 12, 2024

> > why are we placing this under examples?
>
> Where do you suggest we place it? @ahg-g

perhaps under a tools directory?

Signed-off-by: Kunjan <[email protected]>
@coolkp coolkp requested review from liu-cong and ahg-g November 12, 2024 18:46
coolkp and others added 2 commits November 16, 2024 11:49
@coolkp coolkp requested a review from ahg-g November 16, 2024 19:49
ahg-g (Contributor) commented Nov 18, 2024

/lgtm
/approve

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 18, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 18, 2024
k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, coolkp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 18, 2024
@k8s-ci-robot k8s-ci-robot merged commit 54ee6d7 into kubernetes-sigs:main Nov 18, 2024
2 checks passed
shaneutt pushed a commit to shaneutt/gateway-api-inference-extension that referenced this pull request Apr 18, 2025
kfswain pushed a commit to kfswain/llm-instance-gateway that referenced this pull request Apr 29, 2025
* Dynamic lora load/unload sidecar

* Formatting

* Resolve README comments

Signed-off-by: Kunjan Patel <[email protected]>

* Address comments on sidecar, store updates in memory, rename base field

Signed-off-by: Kunjan Patel <[email protected]>

* Address comments in example deployment

Signed-off-by: Kunjan Patel <[email protected]>

* Address comments in example deployment

Signed-off-by: Kunjan Patel <[email protected]>

* base model is optional

Signed-off-by: Kunjan Patel <[email protected]>

* Check health of server before querying

Signed-off-by: Kunjan Patel <[email protected]>

* Check health of server before querying

Signed-off-by: Kunjan Patel <[email protected]>

* Docstrings

Signed-off-by: Kunjan Patel <[email protected]>

* Mock health check in tests

Signed-off-by: Kunjan Patel <[email protected]>

* Refactor configmap, switch to watchfiles to detect symbolic link target changes, pull dynamically from configmap

Signed-off-by: Kunjan Patel <[email protected]>

* Refactor configmap, switch to watchfiles to detect symbolic link target changes, pull dynamically from configmap

Signed-off-by: Kunjan Patel <[email protected]>

* Modify unittests

Signed-off-by: Kunjan Patel <[email protected]>

* Change example host and port to be explicit

Signed-off-by: Kunjan Patel <[email protected]>

* Change example sidecar name

Signed-off-by: Kunjan Patel <[email protected]>

* Add warning about using subPath

Signed-off-by: Kunjan Patel <[email protected]>

* Add screenshots

Signed-off-by: Kunjan Patel <[email protected]>

* Add screenshots

Signed-off-by: Kunjan Patel <[email protected]>

* Add testing results

Signed-off-by: Kunjan Patel <[email protected]>

* Add testing results

Signed-off-by: Kunjan Patel <[email protected]>

* Add config validation

Signed-off-by: Kunjan Patel <[email protected]>

* Add config documentation

Signed-off-by: Kunjan Patel <[email protected]>

* Add config documentation

Signed-off-by: Kunjan Patel <[email protected]>

* Add config validation

Signed-off-by: Kunjan Patel <[email protected]>

* Add config validation

Signed-off-by: Kunjan Patel <[email protected]>

* Make reconciling non blocking

* Move under tools

Signed-off-by: Kunjan <[email protected]>

* Move under tools

Signed-off-by: Kunjan <[email protected]>

* Document usage of sidecar, available by default from 1.29

* Document usage of sidecar, available by default from 1.29

* Document usage of sidecar, available by default from 1.29

Signed-off-by: Kunjan <[email protected]>

* Update tools/dynamic-lora-sidecar/README.md

Co-authored-by: Abdullah Gharaibeh <[email protected]>

* Update tools/dynamic-lora-sidecar/README.md

Co-authored-by: Abdullah Gharaibeh <[email protected]>

---------

Signed-off-by: Kunjan Patel <[email protected]>
Signed-off-by: Kunjan <[email protected]>
Co-authored-by: Abdullah Gharaibeh <[email protected]>
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.
5 participants